Chapter 2: Web Usage Data Pre-processing

Authors

  • Juan D. Velásquez
Abstract

End users leave traces of their behavior all over the Web at all times. From explicit or implicit feedback on a multimedia document or a comment in an online social network, to a simple click on a relevant link in a search engine result, the information that we as users pour into the Web defines its actual representation, which is distinct for each user. Our usage can be captured from different data sources, each of which calls for its own collection strategy as well as its own techniques for merging and cleaning Web usage data. Even once the data is properly preprocessed, identifying an individual user on the Web can be a complex task. Understanding the whole life of a user within a session on a Web site, and the path that user followed, involves advanced data modeling and a set of assumptions that change every day as new ways of interacting with online content are created. The objective is to understand the behaviour and preferences of a Web user, even though doing so raises several privacy issues that, as of today, have no clear resolution. This chapter discusses all of these topics in the processing of Web usage data at length.

Related resources

Traversal Pattern Mining in Web Usage Data

Web usage mining aims to discover useful patterns in web usage data; these patterns provide useful information about the user’s browsing behavior. This chapter examines different types of web usage traversal patterns and the related techniques used to uncover them, including Association Rules, Sequential Patterns, Frequent Episodes, Maximal Frequent Forward Sequences, and Maximal Frequent S...

Chapter 12: Web Usage Mining

With the continued growth and proliferation of e-commerce, Web services, and Web-based information systems, the volumes of clickstream and user data collected by Web-based organizations in their daily operations have reached astronomical proportions. Analyzing such data can help these organizations determine the lifetime value of clients, design cross-marketing strategies across products and ser...

Pre Processing of Web Logs – An Improved Approach For E-Commerce Websites

In this paper an improved approach for pre processing of web logs data has been proposed and evaluated so that it can be applied for web logs of e-commerce web sites. The resultant web log data after these pre processing steps can be used for further pattern discovery and analysis that helps to provide useful prediction to enhance e-commerce. Ideally, the input for the Web Usage Mining process ...

Preprocessing on Web Server Log Data for Web Usage Pattern Discovery

The World Wide Web has gained popularity because it acts as an effective communication medium between businesses and end users. A company needs a web site that satisfies the needs of its end users. Users like to revisit a web site that is usable. The web usage patterns of end users must be identified to improve the usability of any web site. It is done with analy...

An Efficient Approach to Perform Pre-processing

Nowadays, the WWW (World Wide Web) has become more popular and user friendly for transferring information. People are therefore more interested in analyzing log files, which can offer useful insight into web site usage. Web usage mining is a data mining field that deals with the discovery and extraction of useful information from web logs. There are three phases in web usage mining, preproce...

Publication date: 2012